A* Algorithms for the Constrained Multiple Sequence Alignment Problem
Abstract
It is well known that the A* algorithm reduces the search space in many applications. For shortest-path computations, the algorithm uses a heuristic estimator to guide the search toward the destination. It allocates memory dynamically and stores only the vertices it needs, namely those in its open list and closed list. An A* algorithm for the shortest-path search problem is therefore much more space efficient than ordinary search algorithms such as dynamic programming and Dijkstra's algorithm. The constrained multiple sequence alignment (CMSA) problem aims to align similar subsequences in the same region under the guidance of a given pattern (the constraint). The CMSA problem can be viewed as an optimal path search problem in the dynamic programming matrix. In this paper, we propose two A*-based algorithms and show experimentally that, in practice, they solve the CMSA problem using much less memory than the ordinary dynamic programming algorithm.
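To make the path-search view concrete, here is a minimal Python sketch of an A* search over the dynamic programming matrix for the simplest case of two sequences. It is not the paper's algorithm: the function name `astar_constrained_alignment_cost`, the unit mismatch and gap costs, and the length-difference heuristic are all assumptions chosen for illustration. A state `(i, j, k)` means that `i` symbols of the first sequence, `j` symbols of the second, and `k` symbols of the pattern have been consumed; only states that reach the open or closed list are ever stored.

```python
import heapq


def astar_constrained_alignment_cost(a, b, p, mismatch=1, gap=1):
    """Return (optimal cost, number of stored states), or None if infeasible.

    Illustrative assumptions: match cost 0, non-negative mismatch and gap
    costs, and a pattern p that must appear as fully matching columns.
    """
    n, m, r = len(a), len(b), len(p)

    def h(i, j):
        # Lower bound on the remaining cost: the difference in remaining
        # lengths must be covered by gaps. Admissible and consistent.
        return abs((n - i) - (m - j)) * gap

    start, goal = (0, 0, 0), (n, m, r)
    best = {start: 0}                   # cheapest known cost per stored state
    open_heap = [(h(0, 0), 0, start)]   # open list ordered by f = g + h
    closed = set()                      # closed list of expanded states

    while open_heap:
        _, g, state = heapq.heappop(open_heap)
        if state in closed:
            continue                    # stale heap entry
        if state == goal:
            return g, len(best)
        closed.add(state)
        i, j, k = state

        moves = []
        if i < n and j < m:
            sub = 0 if a[i] == b[j] else mismatch
            moves.append(((i + 1, j + 1, k), sub))
            # Constraint column: both symbols equal the next pattern symbol.
            if k < r and a[i] == b[j] == p[k]:
                moves.append(((i + 1, j + 1, k + 1), 0))
        if i < n:
            moves.append(((i + 1, j, k), gap))   # gap in b
        if j < m:
            moves.append(((i, j + 1, k), gap))   # gap in a

        for nxt, cost in moves:
            ng = g + cost
            if nxt not in closed and ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt[0], nxt[1]), ng, nxt))
    return None                         # pattern cannot be embedded
```

Calling `astar_constrained_alignment_cost("ACB", "BCA", "A")` forces the two `A` symbols into one column, which costs more than the unconstrained alignment obtained with an empty pattern. The second returned value counts the states that were actually stored, which can be compared with the `(len(a) + 1) * (len(b) + 1) * (len(p) + 1)` cells a full dynamic programming table would allocate.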
Related resources
An Application of the ABS LX Algorithm to Multiple Sequence Alignment
We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...
A Cuckoo search algorithm (CSA) for Precedence Constrained Sequencing Problem (PCSP)
The precedence constrained sequencing problem (PCSP) is the problem of finding the optimal sequence, the one with the shortest traveling time, among all feasible sequences. In PCSP, precedence relations determine the order of travel between any two nodes. Various methods and algorithms for effectively solving the PCSP have been suggested. In this paper we propose a cuckoo search algorithm (CSA) for effectively ...
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
A Fast Algorithm for the Constrained Multiple Sequence Alignment Problem
Given n strings S1, S2, ..., Sn and a pattern string P, the constrained multiple sequence alignment (CMSA) problem is to find an optimal multiple alignment of S1, S2, ..., Sn such that the alignment contains P, i.e. in the alignment matrix there exists a sequence of columns each entirely composed of symbol P[k] for every k, where P[k] is the kth symbol in P, 1 ≤ k ≤ |P|, and in the se...
Space-Efficient Parallel Algorithms for the Constrained Multiple Sequence Alignment Problem
Given sequences S1, S2, ..., Sn and a pattern string P, the constrained multiple sequence alignment problem (CMSA) is to align similar subsequences of these sequences with the constraint that the alignment “contains” P. The CMSA problem can be considered as an optimal path search problem in the dynamic programming matrix. The problem has a dynamic programming solution that requires O(2|S1||S2...